Estimating dissipation from single stationary trajectories.

In this Letter we show that the time-reversal asymmetry of a stationary time series provides information about the entropy production of the physical mechanism generating the series, even if the details of that mechanism are unknown. We develop estimators of the entropy production that can detect non-equilibrium processes even when there are no measurable flows in the time series.

The relationship between irreversibility and entropy production forms the core of thermodynamics and statistical mechanics. However, it had not been formulated quantitatively until the recent introduction of the Kullback-Leibler distance, or relative entropy, in the context of fluctuation and work theorems [1]. The relative entropy between two probability distributions, $p(x)$ and $q(x)$, is defined as

\[ D(p\|q) = \sum_x p(x) \log\frac{p(x)}{q(x)} \tag{1} \]

and is a measure of their distinguishability [5]. The average entropy production associated with a process driven by an external agent turns out to be equal to the relative entropy between the two probability distributions describing the process running forward and backward in time [U |3HS]. This relative entropy can be thought of as the distinguishability between the process and its time reverse, i.e., as the irreversibility exhibited by the process. The relationship between entropy production and relative entropy has been derived in different scenarios, namely Hamiltonian dynamics [1, 3] and Langevin dynamics [5], and has also been tested in experimental situations [5].
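As a concrete reading of the definition of relative entropy given above, the following minimal Python sketch evaluates it for two discrete distributions. The function name `relative_entropy` and the dictionary representation are illustrative choices, not taken from the Letter; natural logarithms are assumed.

```python
import math

def relative_entropy(p, q):
    """D(p||q) = sum_x p(x) log(p(x)/q(x)), in nats.

    p and q are dicts mapping outcomes to probabilities. Terms with
    p(x) == 0 contribute nothing; if q(x) == 0 where p(x) > 0 the
    relative entropy is infinite.
    """
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf
        total += px * math.log(px / qx)
    return total

# Hypothetical two-outcome example
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.9, "b": 0.1}
print(relative_entropy(p, p))  # identical distributions are indistinguishable
print(relative_entropy(p, q))
print(relative_entropy(q, p))  # the measure is not symmetric
```

The asymmetry visible in the last two lines is what makes the quantity useful here: comparing a process with its time reverse is a directed comparison.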

When applied to non-equilibrium stationary states (NESS), the entropy production per unit time reads

\[ \frac{\langle \dot S \rangle}{k} = \lim_{t\to\infty} \frac{1}{t}\, D\!\left( p(\{x(\tau)\}_{\tau=0}^{t}) \,\big\|\, p(\{x(t-\tau)\}_{\tau=0}^{t}) \right) \tag{2} \]

where $k$ is the Boltzmann constant and $p(\{x(\tau)\}_{\tau=0}^{t})$ is the probability of observing a given trajectory $\{x(\tau)\}_{\tau=0}^{t}$ in phase space. Since we focus on stationary trajectories, where the external forcing, if any, is constant, there is no need to reverse the driving in the backward process. Moreover, a sufficiently long single trajectory can provide all the necessary statistics to compute the relative entropy in Eq. (2) and consequently the entropy production rate.

Fortunately, the full information of the trajectory in phase space is not always necessary. Eq. (2) follows immediately from the Gallavotti-Cohen theorem [7], by replacing the relative entropy between trajectories with $D(p_S(s)\|p_S(-s))$, where $p_S(s)$ is the probability of observing an entropy production $s$ in a time interval $[0, t]$. In general, the relative entropy calculated using partial information, $\{\tilde{x}(\tau)\}_{\tau=0}^{t}$, where $\tilde{x}(\tau)$ is a non-invertible function of $x(\tau)$, only provides a lower bound on the average entropy production [HEME]. For stationary trajectories, instead of Eq. (2) one obtains a lower bound, which is met if $\tilde{x}(t)$ uniquely determines the entropy production $s$.

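For discrete stationary trajectories $x_1, \dots, x_n$, we can define the relative entropy of $n$-strings as

\[ D_n(p_F\|p_B) = \sum_{x_1,\dots,x_n} p(x_1,\dots,x_n) \log\frac{p(x_1,\dots,x_n)}{p(x_n,\dots,x_1)}. \tag{3} \]

Following the above arguments, we arrive at:

\[ \frac{\langle \dot S \rangle}{k} \ge d(p_F\|p_B) = \lim_{n\to\infty} \frac{1}{n} D_n(p_F\|p_B). \tag{4} \]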

This equation reveals a striking connection between physics and the statistics of a time series. The l.h.s. is a purely physical quantity (it is proportional to the average dissipated energy per step), whereas the r.h.s. is a statistical quantity depending solely on the data $x_1, x_2, \dots$, but not on the physical mechanism generating those data. Such a connection is a generalization of Landauer's principle relating entropy production and logical irreversibility [1, 10]. Eq. (4) extends this principle and suggests that we can determine the entropy production of an arbitrary NESS by computing the relative entropy of forward and backward trajectories. We could, for instance, determine whether a biological process is active or passive or even estimate, or bound, the amount of consumed ATP by measuring the relative entropy of data generated in the process.

In this Letter we explore the feasibility of such a technique by analyzing the validity of Eq. (4) and developing estimators of the relative entropy. Our approach is general, but we use a discrete flashing ratchet as a case study, wherein direct comparison between analytical and empirical values of the relative entropy and the entropy production is possible. There have been previous attempts to distinguish between equilibrium and NESS. Martin et al. checked the fluctuation-dissipation relationship in experimental data from hair bundles of hair cells [11], but this approach needs two types of data: spontaneous and forced fluctuations. Amman et al. analyzed the possibility of discriminating between equilibrium and non-equilibrium in a three-state chemical system [12]. Finally, Kennel introduced in [13] criteria based on compression algorithms to distinguish between symmetric and asymmetric time series in the context of chaotic signals, without any connection to dissipation. As we show in this Letter, relative entropy provides a more general and simpler framework for the problem of distinguishing between equilibrium and NESS and, moreover, yields estimates of, and lower bounds on, the entropy production.

Two strategies have been considered to estimate the relative entropy between stochastic processes: the first is based on brute-force counting of $n$-strings, obtaining empirical estimates of $p(x_1, \dots, x_n)$, and computing $D_n$ using Eq. (3); the second is based on string parsing, the basic procedure of the Lempel-Ziv compression algorithm.

The first strategy is simpler and more effective for Markov chains. Our results indicate that this is still the case for some non-Markov processes [15]. Consequently, we will restrict ourselves in this Letter to estimations of relative entropy from empirical probability distributions.
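The counting strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, strings whose reverse never occurs in the data are simply skipped (one possible way of handling empty bins), and the test series is a toy biased walk on a three-site ring rather than any system from the Letter.

```python
import math
import random
from collections import Counter

def block_relative_entropy(seq, k):
    """Plug-in estimate of D_k between a series and its time reverse (nats)."""
    n_blocks = len(seq) - k + 1
    counts = Counter(tuple(seq[i:i + k]) for i in range(n_blocks))
    total = 0.0
    for block, c in counts.items():
        c_rev = counts.get(block[::-1], 0)
        if c_rev == 0:
            continue  # assumption: drop strings whose reverse was never observed
        total += (c / n_blocks) * math.log(c / c_rev)
    return total

def d_k(seq, k):
    """Increment D_k - D_{k-1}; D_1 vanishes for a series and its reverse."""
    return block_relative_entropy(seq, k) - block_relative_entropy(seq, k - 1)

def biased_ring_walk(n, p_forward, seed=0):
    """Toy Markov chain (hypothetical test case): hop +1 or -1 on a 3-site ring."""
    rng = random.Random(seed)
    x, out = 0, []
    for _ in range(n):
        x = (x + (1 if rng.random() < p_forward else -1)) % 3
        out.append(x)
    return out

walk = biased_ring_walk(100_000, 0.8)
exact = (0.8 - 0.2) * math.log(0.8 / 0.2)  # D_2 of this toy chain
print(d_k(walk, 2), d_k(walk, 3), exact)
```

For this Markovian toy series the two-block and three-block increments should agree with each other and with the exact value up to sampling error, while an unbiased walk should give values close to zero.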

If the process and its reverse are Markovian, $p(x_1, x_2, \dots, x_n) = p(x_1)\,p(x_2|x_1)\cdots p(x_n|x_{n-1})$, the relative entropy rate $d$ defined in Eq. (4) can be expressed in terms of the relative entropy between distributions of substrings of size 2:

\[ d(p_F\|p_B) = D_2(p_F\|p_B) - D_1(p_F\|p_B). \tag{5} \]

In the specific case of a trajectory and its reverse, the one-time statistics are identical and $D_1(p_F\|p_B) = 0$. Then for Markovian dynamics $d(p_F\|p_B) = D_2$, which can be calculated by frequency counting if the number of states and possible transitions is not large. In general, if one defines



\[ d_k = D_k - D_{k-1}, \tag{6} \]

then $d_k \to d$ for $k \to \infty$. The limit is reached for finite $k$ for the so-called $k$-th order Markov chains, i.e. when blocks of size $k$, $X_k = (x_n, \dots, x_{n+k-1})$, are Markovian [16]. In this case $d(p_F\|p_B) = d_{k+1} = d_{k+2} = \cdots$. For more general processes, we will use the following ansatz, proposed in Ref. [17] for Shannon entropy estimation:

\[ d_k \simeq d_\infty - c\,\frac{\log k}{k^{\gamma}}, \tag{7} \]

where $c$ and $\gamma$ are parameters that, together with $d_\infty$, can be obtained by fitting the empirical values of $d_k$ vs. $k$.
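One simple way to carry out such a fit is sketched below in Python. The fitting procedure is an assumption made for illustration (the Letter does not specify one): for each trial exponent $\gamma$ on a grid the model $d_k = d_\infty - c \log k / k^\gamma$ is linear in $d_\infty$ and $c$, so these two are obtained by ordinary least squares and the $\gamma$ with the smallest residual is kept. The function name and the grid are hypothetical.

```python
import math

def fit_ansatz(ks, ds, gammas=None):
    """Fit d_k = d_inf - c * log(k) / k**gamma; returns (d_inf, c, gamma)."""
    if gammas is None:
        gammas = [0.01 * i for i in range(5, 301)]  # assumed search range
    n = len(ks)
    mean_d = sum(ds) / n
    best = None
    for g in gammas:
        # Regressor u_k = -log(k)/k^g, so that d_k = d_inf + c * u_k
        u = [-math.log(k) / k ** g for k in ks]
        mean_u = sum(u) / n
        suu = sum((ui - mean_u) ** 2 for ui in u)
        sud = sum((ui - mean_u) * (di - mean_d) for ui, di in zip(u, ds))
        c = sud / suu
        d_inf = mean_d - c * mean_u
        sse = sum((d_inf + c * ui - di) ** 2 for ui, di in zip(u, ds))
        if best is None or sse < best[0]:
            best = (sse, d_inf, c, g)
    return best[1], best[2], best[3]

# Synthetic check with known (made-up) parameters
ks = list(range(2, 10))
ds = [1.0 - 0.5 * math.log(k) / k ** 1.2 for k in ks]
print(fit_ansatz(ks, ds))
```

With noisy empirical $d_k$ the residual surface in $\gamma$ can be shallow, so in practice the uncertainty of the extrapolated $d_\infty$ should be assessed separately.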

We have tested the accuracy of these estimators and of the bound (4) in a specific example: a discrete flashing ratchet [18], consisting of a particle moving on a one-dimensional lattice. The particle is at temperature $T$ and moves in a periodic and asymmetric potential of height $2V$, which is switched on and off at a rate $r$ (see Fig. 1). Trajectories are described by two variables: the position of the particle, $x \in \{0, 1, 2\}$, and the state of the potential (on or off), $y \in \{0, 1\}$.

FIG. 1: Discrete ratchet scheme. Particles can jump between the states $i \to j$, $i' \to j'$, and $i \to i'$ in a flashing asymmetric potential of height $2V$ with periodic boundary conditions. The switching rate of the potential is $r$.

FIG. 2: Average dissipation per step (in units of $kT$) in the flashing ratchet ($r = 1$) and different estimations of relative entropy using a trajectory with $n = 10^6$ steps and full information, as a function of $V/kT$: analytical calculation of the average dissipation (black line), $d_2$ (blue circles), $d_3$ (red squares).

To define the dynamics of the particle, we start with a continuous-time description based on rates of spatial jumps and switching. We assume that the motion in each potential obeys detailed balance: $k_{i\to j} = e^{-\beta(V_j - V_i)/2}$ and $k_{i'\to j'} = 1$ for $i, j = 0, 1, 2$ with $i \neq j$, where $\beta = 1/kT$. The system is driven out of equilibrium by imposing constant switching rates $k_{i\to i'} = k_{i'\to i} = r$, $i = 0, 1, 2$, which do not obey detailed balance.
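The model can be made concrete with a short Python sketch that builds the rates and generates the sequence of visited states. Several ingredients are assumptions made for illustration and are not confirmed by the text: energies are measured in units of $kT$, the potential when switched on is taken as $V_x = xV$ for $x = 0, 1, 2$ (so that its height is $2V$), spatial rates have the symmetric form $e^{-\Delta V/2}$, and all function names are hypothetical.

```python
import math
import random

def ratchet_rates(V, r):
    """Continuous-time rates between states (x, y); y = 1 means potential on.

    Assumptions: kT = 1, V_x = x * V when the potential is on, V_x = 0
    when it is off, and symmetric rates exp(-dV / 2) for spatial jumps.
    """
    rates = {}
    for x in range(3):
        for y in (0, 1):
            for x2 in range(3):
                if x2 != x:
                    dV = (x2 - x) * V * y      # energy change of the jump
                    rates[((x, y), (x2, y))] = math.exp(-dV / 2.0)
            rates[((x, y), (x, 1 - y))] = r    # switching at constant rate r
    return rates

def jump_probabilities(rates):
    """Chain of visited states: p(a -> b) = k(a -> b) / sum_b' k(a -> b')."""
    probs = {}
    for (a, b), k in rates.items():
        probs.setdefault(a, {})[b] = k
    for row in probs.values():
        z = sum(row.values())
        for b in row:
            row[b] /= z
    return probs

def simulate(V, r, n_steps, seed=0):
    """Sequence of visited states, discarding the times of the events."""
    rng = random.Random(seed)
    probs = jump_probabilities(ratchet_rates(V, r))
    table = {a: (list(row), list(row.values())) for a, row in probs.items()}
    state = (0, 1)
    traj = [state]
    for _ in range(n_steps):
        targets, weights = table[state]
        state = rng.choices(targets, weights=weights)[0]
        traj.append(state)
    return traj

print(simulate(V=1.0, r=1.0, n_steps=10))
```

Projecting each state `(x, y)` onto `x` alone gives a position-only series, in which a switching event shows up as a repeated position.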

FIG. 3: Average dissipation per step (in units of $kT$) in the flashing ratchet ($r = 1$) and different estimations of relative entropy using a trajectory with $n = 10^7$ steps and partial information (position) as a function of $V/kT$: analytical calculation of the average dissipation (black line), $d_2$ (blue circles), $d_9$ (green diamonds), $d_\infty$ in Eq. (7) (orange circles with error bars), and Monte-Carlo semi-analytical calculation of $d$ (purple crosses). Inset: estimators for weak potentials in a log-log plot. We have added in the inset the analytical calculation of $d_2$ (blue solid line).

We will focus on the dissipation per step: from the continuous trajectory $(x(t), y(t))$ we generate a series $(x_n, y_n)$ comprising the states visited by the system. That is, we drop the information about the times when jumps or switches occur. $(x_n, y_n)$ is a Markov chain with transition probabilities given by

\[ p_{\alpha\to\gamma} = \frac{k_{\alpha\to\gamma}}{\sum_{\gamma'} k_{\alpha\to\gamma'}}, \]

with $\alpha, \gamma = 0, 1, 2, 0', 1', 2'$. Introducing these probabilities in Eq. (5), $d(p_F\|p_B) = \beta \sum p_{\alpha\gamma} (V_\alpha - V_\gamma)$, where $p_{\alpha\gamma}$ is the stationary probability of observing $\alpha$ followed by $\gamma$ and the sum runs over transitions mediated by the thermal bath, $i \to j$, $i' \to j'$. The relative entropy turns out to be the average dissipation per step in units of $kT$ and we recover the main result, Eq. (4) [22]. It is also interesting to explore the relationship between $d_2$ and the stationary flows $J_{\alpha\gamma} = p_{\alpha\gamma} - p_{\gamma\alpha}$ between states $\alpha, \gamma = 0, 1, 2, 0', 1', 2'$. If $J_{\alpha\gamma} \ll p_{\alpha\gamma}$, we have:

\[ d_2 \simeq \sum_{\alpha,\gamma} \frac{(J_{\alpha\gamma})^2}{2 p_{\alpha\gamma}} = \sum_{\alpha<\gamma} \frac{(J_{\alpha\gamma})^2}{p_{\alpha\gamma}}, \tag{8} \]

which is a well-known expression of the entropy production in continuous Markov systems [19], where $d_2 = d$.
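The step from $D_2$ to the quadratic expression in the flows can be sketched as follows. This is a reconstruction under the stated condition $J_{\alpha\gamma} \ll p_{\alpha\gamma}$, using $D_1 = 0$ so that $d_2 = D_2$ and writing $p_{\alpha\gamma}$ for the stationary probability of observing $\alpha$ followed by $\gamma$; it is not quoted from the Letter.

```latex
\begin{align}
d_2 = D_2 &= \sum_{\alpha,\gamma} p_{\alpha\gamma} \log\frac{p_{\alpha\gamma}}{p_{\gamma\alpha}}
        = \sum_{\alpha<\gamma} (p_{\alpha\gamma} - p_{\gamma\alpha}) \log\frac{p_{\alpha\gamma}}{p_{\gamma\alpha}} \\
      &= -\sum_{\alpha<\gamma} J_{\alpha\gamma} \log\!\left(1 - \frac{J_{\alpha\gamma}}{p_{\alpha\gamma}}\right)
        \simeq \sum_{\alpha<\gamma} \frac{(J_{\alpha\gamma})^2}{p_{\alpha\gamma}} .
\end{align}
```

The second equality pairs each term with its reverse, the third uses $p_{\gamma\alpha} = p_{\alpha\gamma} - J_{\alpha\gamma}$, and the last step uses $-\log(1-\epsilon) \simeq \epsilon$ for small $\epsilon$. Summing over all ordered pairs instead of $\alpha<\gamma$ counts every pair twice, which accounts for a factor $1/2$ in that form of the sum.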

Fig. 2 shows the dissipation, calculated analytically by solving the six-state Markov chain in the stationary regime, and the estimations discussed above. Due to Markovianity, the relative entropies $d_k$ converge immediately, $d = d_2 = d_3 = \cdots$, and $d$ is equal to the entropy production per step. As long as one has a good estimation of $p(x_1, \dots, x_k)$, our approach provides accurate values of the entropy production, which is the case for weak potentials $V \sim kT$. If $V \gg kT$, then uphill jumps, $0 \to 1$, $0 \to 2$, and $1 \to 2$, are so unlikely that they do not occur in a finite trajectory. The higher the order of the statistics, the earlier this problem arises, as shown in Fig. 2. The reason is that $d_3$ involves probability distributions of three-step trajectories: the sampling space is bigger and it is more likely that some transitions $i \to j \to k$ do not appear while their reverses do. Although these jumps are very unlikely, they contribute significantly to $d$, as shown in Fig. 2, where $d_2$ and $d_3$ have been calculated by restricting the sum in $D_k$ to strings satisfying $p(x_1 \dots x_k) \neq 0$ and $p(x_k \dots x_1) \neq 0$.
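The analytical side of this comparison can be illustrated numerically. The Python sketch below builds the six-state chain, obtains its stationary distribution by power iteration (a choice made here for simplicity; the text does not say how the chain was solved), and evaluates both the dissipation per step and $D_2$. The same assumptions as before apply and are not confirmed by the text: $kT = 1$, potential $V_x = xV$ when on, symmetric rates $e^{-\Delta V/2}$, and hypothetical function names.

```python
import math

def transition_matrix(V, r):
    """Jump probabilities of the six-state chain (y = 1: potential on)."""
    P = {}
    for x in range(3):
        for y in (0, 1):
            row = {(x, 1 - y): r}                      # switching
            for x2 in range(3):
                if x2 != x:                            # spatial jump
                    row[(x2, y)] = math.exp(-(x2 - x) * V * y / 2.0)
            z = sum(row.values())
            P[(x, y)] = {b: k / z for b, k in row.items()}
    return P

def stationary(P, n_iter=5000):
    """Stationary distribution by repeated application of the chain."""
    pi = {a: 1.0 / len(P) for a in P}
    for _ in range(n_iter):
        new = dict.fromkeys(P, 0.0)
        for a, row in P.items():
            for b, pab in row.items():
                new[b] += pi[a] * pab
        pi = new
    return pi

def dissipation_and_d2(V, r):
    P = transition_matrix(V, r)
    pi = stationary(P)
    joint = {(a, b): pi[a] * pab for a, row in P.items() for b, pab in row.items()}
    # Heat released to the bath in jumps made while the potential is on
    diss = sum(p * (a[0] - b[0]) * V
               for (a, b), p in joint.items() if a[1] == 1 and b[1] == 1)
    # Relative entropy between forward and reversed two-step statistics
    d2 = sum(p * math.log(p / joint[(b, a)]) for (a, b), p in joint.items())
    return diss, d2

print(dissipation_and_d2(1.0, 1.0))
```

Under these assumptions the two returned numbers coincide up to numerical precision, which is the full-information equality between relative entropy and dissipation per step discussed in the text.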

In real applications, it is more likely that one has only partial information about the trajectories. To study the accuracy of our estimators and of the inequality (4) in this case, we remove the information about the state of the potential and consider trajectories described only by the position $\{x_k\}_{k=1}^{n}$, which in general are not Markovian. As a consequence, the estimation of the relative entropy $d(p_F\|p_B)$ is more difficult, and even a good estimation of $d$ only provides a lower bound on the entropy production. It is known that the Gallavotti-Cohen symmetry does not hold in the continuous flashing ratchet if the state of the potential is not considered [20]. In fact, the bound (4) can be quite loose. For instance, if $r \to \infty$, switching is very fast and the particle moves in an effective potential (the average of on and off) which is periodic. The position $x_k$ becomes Markovian and the current vanishes. Using Eq. (8) one arrives at $d = d_2 = 0$, whereas the dissipation per step is non-zero.

FIG. 4: Average dissipation per step (in units of $kT$) in the flashing ratchet ($r = 2$, $V = 2kT$) with external force $F$ and different estimations of relative entropy using a trajectory with $n = 10^7$ steps with partial information (position): analytical calculation of the average dissipation (black line), $d_2$ (blue circles, analytical values in blue dashed line), $d_3$ (red squares), $d_9$ (green diamonds), and semi-analytical calculation of $d$ (purple crosses). The minimum in $d_2$ corresponds to the stall force.

In most cases, however, the bound given by Eq. (4) provides significant information. In Fig. 3 we show the estimation of $d$ using the empirical values of $d_k$ for $k = 2, 9$, and the extrapolation $d_\infty$ resulting from the fit of the ansatz in Eq. (7). The error bars in Fig. 3 correspond to the error in the fit with a confidence interval of 90%. Our estimations clearly distinguish between the equilibrium case ($V = 0$) and the NESS. The empirical $d_k$ with $k > 3$ correctly reproduce the order of magnitude of the actual dissipation (see inset in Fig. 3), although they underestimate it. There are two possible causes for this deviation: either we are underestimating the actual relative entropy $d$, or the bound provided by Eq. (4) is not tight. To clarify this question we need an analytical calculation of the relative entropy between two non-Markov processes. In our case, the relative entropy $D_n$ reads:

\[ D_n = \left\langle \log \frac{\sum_{y_1,\dots,y_n} p(x_1,y_1;\dots;x_n,y_n)}{\sum_{y_1,\dots,y_n} p(x_n,y_n;\dots;x_1,y_1)} \right\rangle, \tag{9} \]

where the average is taken over all possible trajectories. The probability distribution $p(x_1,y_1;\dots;x_n,y_n) = p(x_1,y_1)\,p(x_2,y_2|x_1,y_1)\cdots p(x_n,y_n|x_{n-1},y_{n-1})$ is known, but Eq. (9) cannot be calculated exactly. Fortunately, the log in Eq. (9) is a self-averaging quantity for large $n$ [ST] and we can compute the average using a single long typical trajectory [15]. We show in Fig. 3 the value of $d$ obtained by this Monte-Carlo semi-analytical calculation (purple crosses), which is very close to the estimation $d_\infty$ based on the ansatz Eq. (7).
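A possible implementation of such a single-trajectory calculation is sketched below in Python. It is an illustration under assumptions, not the authors' procedure: the sum over the hidden potential states is carried out with a normalized forward recursion (the standard hidden-Markov technique), the model uses $kT = 1$, $V_x = xV$ when the potential is on and symmetric rates $e^{-\Delta V/2}$, switching events are kept in the position series as repeated positions, and all function names are hypothetical.

```python
import math
import random

def transition_matrix(V, r):
    """Jump probabilities of the six-state chain (y = 1: potential on)."""
    P = {}
    for x in range(3):
        for y in (0, 1):
            row = {(x, 1 - y): r}
            for x2 in range(3):
                if x2 != x:
                    row[(x2, y)] = math.exp(-(x2 - x) * V * y / 2.0)
            z = sum(row.values())
            P[(x, y)] = {b: k / z for b, k in row.items()}
    return P

def stationary(P, n_iter=2000):
    pi = {a: 1.0 / len(P) for a in P}
    for _ in range(n_iter):
        new = dict.fromkeys(P, 0.0)
        for a, row in P.items():
            for b, pab in row.items():
                new[b] += pi[a] * pab
        pi = new
    return pi

def log_prob_positions(xs, P, pi):
    """log of sum over y_1..y_n of p(x_1,y_1;...;x_n,y_n), forward recursion."""
    alpha = {y: pi[(xs[0], y)] for y in (0, 1)}
    log_p = 0.0
    for x_prev, x_next in zip(xs, xs[1:]):
        norm = sum(alpha.values())
        log_p += math.log(norm)            # accumulate, then renormalize
        new = {0: 0.0, 1: 0.0}
        for y, a in alpha.items():
            for (x2, y2), pab in P[(x_prev, y)].items():
                if x2 == x_next:
                    new[y2] += (a / norm) * pab
        alpha = new
    return log_p + math.log(sum(alpha.values()))

def semi_analytical_d(V, r, n_steps, seed=0):
    """(1/n) [log p(x_1..x_n) - log p(x_n..x_1)] along one simulated series."""
    rng = random.Random(seed)
    P = transition_matrix(V, r)
    pi = stationary(P)
    state, xs = (0, 1), [0]
    for _ in range(n_steps):
        row = P[state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
        xs.append(state[0])                # keep the position only
    forward = log_prob_positions(xs, P, pi)
    backward = log_prob_positions(xs[::-1], P, pi)
    return (forward - backward) / n_steps

print(semi_analytical_d(2.0, 1.0, 50_000))
```

Because the log-ratio is evaluated on a single series, the result carries a statistical error that shrinks with the length of the series; at $V = 0$ the forward and backward probabilities coincide and the estimate vanishes.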

Although the relative entropy $d$ underestimates the actual dissipation, it does reproduce its asymptotic behavior. Entropy production decreases as $V^2$ for small $V$, and so do $d_\infty$ and $d_9$ (see inset of Fig. 3). On the other hand, $d_2 \propto V^6$, since the current is $J \propto V^3$ (see Eq. (8)).

We have found in several instances a similar qualitative improvement in the estimation of relative entropy when using blocks of size bigger than two. In particular, $d_3$ and above outperform $d_2$, which, as indicated by Eq. (8), is equivalent to the standard calculation of entropy production using the currents observable from the available data; in our case, the spatial current. For a striking illustration of this effect we add an external force $F$ to the flashing ratchet and study dissipation and relative entropy close to the stalling force $F_{\rm stall}$, for which the spatial current and $d_2$ both vanish. Jumping rates are now biased in the direction of the force, giving the following detailed balance condition: $k_{i\to j}/k_{j\to i} = e^{-\beta(V_j - V_i - F L_{ij})}$, $L_{ij} = 1$ being the distance between $i$ and $j$.

We have plotted in Fig. 4 the real dissipation, the analytical value of $d$ and $d_2$, and the empirical values of $d_2$, $d_3$, and $d_9$, close to the stalling force $F_{\rm stall}$. Recall that, for $F = F_{\rm stall}$, the position of the particle does not exhibit any flow and its average position remains constant. Consequently, $d_2$ or any other estimation of entropy production based on flows will fail. However, the relative entropy calculated using blocks of size 3 captures the non-equilibrium nature of the time series.

In conclusion, we have shown that the statistical properties of a time series impose a lower bound on the entropy produced in generating the series. This lower bound is valid even if we do not have any access to, or information about, the physical mechanism generating the data. Finally, we have shown that the bound can be non-trivial, predicting dissipation even when the data do not exhibit any measurable flow. Our techniques could be applied to data from different sources. In the case of biological systems, they could help to distinguish between passive and active processes, and even to estimate ATP consumption. On the other hand, as in the case of Landauer's principle, relative entropy can be used to ascertain the minimal entropy production associated with a specific behavior, such as spatiotemporal patterns, excitable systems, etc. This in turn may influence the design of optimal devices with functionalities given by these behaviors.